Developing STT and KWS systems using limited language resources

نویسندگان

  • Viet Bac Le
  • Lori Lamel
  • Abdelkhalek Messaoudi
  • William Hartmann
  • Jean-Luc Gauvain
  • Cécile Woehrling
  • Julien Despres
  • Anindya Roy
چکیده

This paper presents recent progress in developing speech-totext (STT) and keyword spotting (KWS) systems for the 2014 IARPA-Babel evaluation. Systems have been developed for the limited language pack condition for four of the five development languages in this program phase: Assamese, Bengali, Haitian Creole and Zulu. The systems have several novel characteristics that support rapid development of KWS systems. On the STT side different acoustic units are explored based on phonemic or graphemic representations, and system combination is used to improve STT performance. The acoustic models are trained on only 10 hours of speech data with manual transcriptions, completed with unsupervised training on additional untranscribed data. Both word and subword units (morphologically decomposed, syllables, phonemes) are used for KWS. The KWS systems are based on the multi-hypotheses produced by a consensus network decoding or searching word lattices. The word error rates of the individual STT systems are on the order of 50-60%, and the KWS systems obtain Maximum Term Weighted Values ranging from 30-45% for all keywords (invocabulary and out-of-vocabulary (OOV)). Sub-word units are shown to be successful at locating some of the OOV keywords, and system combination improves system performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Active learning based data selection for limited resource STT and KWS

This paper presents first results in using active learning (AL) for training data selection in the context of the IARPABabel program. Given an initial training data set, we aim to automatically select additional data (from an untranscribed pool data set) for manual transcription. Initial and selected data are then used to build acoustic and language models for speech recognition. The goal of th...

متن کامل

Data augmentation, feature combination, and multilingual neural networks to improve ASR and KWS performance for low-resource languages

This paper presents the progress of acoustic models for lowresourced languages (Assamese, Bengali, Haitian Creole, Lao, Zulu) developed within the second evaluation campaign of the IARPA Babel project. This year, the main focus of the project is put on training high-performing automatic speech recognition (ASR) and keyword search (KWS) systems from language resources limited to about 10 hours o...

متن کامل

Tunable keyword-aware language modeling and context dependent fillers for LVCSR-based spoken keyword search

We explore the potential of using keyword-aware language modeling to extend the ability of trading higher false alarm rates in exchange for lower miss detection rates in LVCSRbased keyword search (KWS). A context-dependent keyword language modeling method is also proposed to further enhance the keyword-aware language modeling framework by reducing the number of false alarms often sacrificed in ...

متن کامل

Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED

Recently there has been increased interest in Automatic Speech Recognition (ASR) and Key Word Spotting (KWS) systems for low resource languages. One of the driving forces for this research direction is the IARPA Babel project. This paper describes some of the research funded by this project at Cambridge University, as part of the Lorelei team co-ordinated by IBM. A range of topics are discussed...

متن کامل

Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages

In recent years there has been significant interest in Automatic Speech Recognition (ASR) and Key Word Spotting (KWS) systems for low resource languages. One of the driving forces for this research direction is the IARPA Babel project. This paper examines the performance gains that can be obtained by combining two forms of deep neural network ASR systems, Tandem and Hybrid, for both ASR and KWS...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014